AFFINE CONNECTIONS, DUALITY AND 
DIVERGENCES FOR A VON NEUMANN ALGEBRA. 



ANNA JENCOVA 



Mathematical Institute, Slovak Academy of Sciences, 
Stefanikova 49, 814 73 Bratislava, Slovakia, 
jenca@mat.savba.sk

Abstract. On the predual of a von Neumann algebra, we define a
differentiable manifold structure and affine connections by embeddings into
non-commutative $L_p$-spaces. Using the geometry of uniformly convex
Banach spaces and duality of the $L_p$ and $L_q$ spaces for $1/p + 1/q = 1$, we
show that we can introduce the $\alpha$-divergence, for $\alpha \in (-1, 1)$, in a similar
manner as Amari in the classical case. If restricted to the positive cone, the
$\alpha$-divergence belongs to the class of quasi-entropies, defined by Petz.

1. Introduction 

Classical information geometry deals with the differential geometric
aspects of families of probability densities with respect to a given
measure $\mu$. The theory, developed in (TJEj, has already been extended
to the nonparametric case, where the manifold is modelled on some
infinite dimensional Banach space, see I7|.

One of the important results of Amari's classical (finite dimensional)
information geometry ^ |2] deals with the structure of Riemannian
manifolds with a pair of flat affine connections, dual with respect to
the metric. For such manifolds, there is a pair $(\theta, \eta)$ of dual affine
coordinate systems, related by Legendre transformations

$$\theta_i = \frac{\partial \varphi(\eta)}{\partial \eta_i}, \qquad \eta_i = \frac{\partial \psi(\theta)}{\partial \theta_i},$$

where $\psi$, $\varphi$ are potential functions. A quasi-distance, called the
divergence, is then defined by

$$D(\theta_1, \theta_2) = \psi(\theta_1) + \varphi(\eta_2) - \sum_i \theta_{1i}\,\eta_{2i}.$$



For manifolds of probability density functions, flat with respect to the
$\pm\alpha$-connections, the corresponding $\alpha$-divergence belongs to the class
of Csiszár's $f$-divergences

$$S_f(p, q) = \int f\Big(\frac{q}{p}\Big)\, dP,$$

where $f$ is a convex function. The $f$-divergences were generalized to
von Neumann algebras by Petz in ^H] by means of the relative modular
operator of normal positive functionals on $M$:

$$S_f(\varphi, \psi) = \langle f(\Delta_{\varphi,\psi})\xi_\psi, \xi_\psi\rangle,$$

where $\xi_\psi$ is the vector representative of $\psi$. On the other hand, Amari's
construction of the $\alpha$-divergence, starting from a pair of dual flat
connections, was extended to the manifold of faithful positive linear
functionals on a matrix algebra $M_n(\mathbb{C})$, [T3J EJ]. The aim of the present
paper is to show that there is such a construction for a general von
Neumann algebra.

For $\alpha \in (-1, 1)$, the $\alpha$-connections can be defined using $\alpha$-embeddings
into non-commutative $L_p$-spaces, $p = \frac{2}{1-\alpha}$. In this case, the $\alpha$- and
$-\alpha$-connections are defined on different vector bundles and their duality
corresponds to the Banach space duality of $L_p$ and $L_q$, $1/p + 1/q = 1$;
therefore this duality does not require a Riemannian metric. This was
shown by Gibilisco and Isola in jH] (see also [7] for the classical case).
There, the $\alpha$-embeddings were used to define the $\alpha$-connections on
manifolds of faithful density operators of a semifinite von Neumann algebra.
The manifold structure, however, was not specified there, although some
definitions of such a structure had already appeared, see [TT| I2T| 122].

Another possibility is to use the $\alpha$-embedding to introduce the
manifold structure. The problem here is that the range of the $\alpha$-embedding
is in the positive cone of the $L_p$-space which, even in the classical case,
can have empty interior. This problem was avoided in in defining 
the $\alpha$-embedding on the whole predual $M_*$ and not just on the positive
cone.

The $\alpha$-connections are defined as the trivial connections in $L_p(M, \phi)$
and the $\pm\alpha$-duality is just the Banach space duality. The $\pm\alpha$-embeddings
define a pair of dual coordinates on $M_*$. Using the fact that the $L_p$
spaces with $p \in (1, \infty)$ are uniformly convex, it was shown that the
dual coordinates are related by potential functionals, just as in Amari's
theory. From this, we can define a divergence functional on $L_p(M, \phi)$.

Via the $\alpha$-embedding, the divergence in $L_p(M, \phi)$ induces a
functional on $M_* \times M_*$, which is called the $\alpha$-divergence. We will show that
if restricted to the positive cone, the $\alpha$-divergence is exactly the Petz
quasi-entropy $S_{g_\alpha}$, with

$$g_\alpha(t) := \frac{2}{1-\alpha} + \frac{2}{1+\alpha}\,t - \frac{4}{1-\alpha^2}\,t^{\frac{1+\alpha}{2}}.$$



We will further investigate the properties of the divergence in $L_p(M, \phi)$,
especially the projection theorems. These imply some existence and
uniqueness results for the $\alpha$-projections, which generalize the
projection theorems in pQ.



2. Uniformly convex Banach spaces. 

We recall some facts about convexity and smoothness in Banach 
spaces, see [To] . 

Let $X$ be a Banach space and let $X^*$ be the dual of $X$. Then for
$u \in X^*$ we denote $\langle x, u\rangle = u(x)$. Let $K$ be a closed convex subset in $X$
with nonempty interior, in particular, let $K_d$ be the closed ball with radius
$d$. Let $S$ be the boundary of $K$.

A supporting hyperplane of $K$ is a real hyperplane $x + H$, containing
at least one point of $K$ and such that $K$ lies in one of the two closed
half-spaces determined by $x + H$. There is at least one supporting
hyperplane through every boundary point of $K$. A boundary point
$x_0 \in S$ is called a point of smoothness if exactly one closed supporting
hyperplane passes through $x_0$, called a tangent hyperplane. We say
that $K$ is smooth if every boundary point is a point of smoothness.
The space $X$ is called smooth if $K_1$ is smooth.

A normed space is smooth if and only if the norm is weakly
differentiable at each point except the origin. The weak derivative of the norm
at $x_0$ in the direction $y$ is given by $\mathrm{Re}\langle y, v_{x_0/\|x_0\|}\rangle$, where $v_{x_0/\|x_0\|}$ is the
unique point in the unit sphere of $X^*$ satisfying $\langle x_0, v_{x_0/\|x_0\|}\rangle = \|x_0\|$
and $\mathrm{Re}$ denotes the real part. The tangent hyperplane to the sphere
$S_{\|x_0\|}$ at $x_0$ is $x_0 + H$, with

$$H = \{x \in X,\ \mathrm{Re}\langle x, v_{x_0/\|x_0\|}\rangle = 0\}.$$

The set K is said to be strictly convex if every boundary point of K 
is an extreme point, equivalently, the boundary of K contains no line 
segment. In this case, each supporting hyperplane meets K in exactly 
one point. 

A reflexive Banach space is smooth if and only if its dual X* is 
strictly convex, that is, the unit ball in X* is strictly convex. 

The space $X$ and its closed unit ball are said to be uniformly convex
if for each $\varepsilon$, $0 < \varepsilon \le 2$, there is a $\delta(\varepsilon) > 0$ such that $\|x\| \le 1$, $\|y\| \le 1$
and $\|x - y\| \ge \varepsilon$ always implies that $\|\frac{1}{2}(x + y)\| \le 1 - \delta(\varepsilon)$. The function
$\delta(\varepsilon)$ is called the modulus of convexity. Every uniformly convex space
is strictly convex and reflexive.

There is also a stronger notion of smoothness, dual to uniform
convexity. The space $X$, and its norm, are said to be uniformly smooth if
for each $\varepsilon > 0$ there is an $\eta(\varepsilon) > 0$, such that $\|x\| \ge 1$, $\|y\| \ge 1$ and
$\|x - y\| \le \eta(\varepsilon)$ always implies $\|x + y\| \ge \|x\| + \|y\| - \varepsilon\|x - y\|$.

A normed space X is uniformly smooth if and only if its norm is uni- 
formly strongly differentiable. In particular, every uniformly smooth 
normed space is smooth. A Banach space X is uniformly convex (uni- 
formly smooth) if and only if X* is uniformly smooth (uniformly con- 
vex). 

We will also need the following two results by Cudia [H]. 

Theorem 2.1. Let $S$ resp. $S'$ be the unit sphere in $X$ resp. $X^*$. The
norm is (uniformly) strongly differentiable in $S$ if and only if the map
$v : x \mapsto v_x$ is single valued and (uniformly) continuous from the norm
topology on $S$ to the norm topology on $S'$.

Let us now define the map $F : X \to X^*$ by

$$F(x) = \|x\|\, v_{x/\|x\|} \quad (x \ne 0), \qquad F(0) = 0.$$


Theorem 2.2. Let the Banach space X be uniformly convex and let 
the norm be strongly differentiable. Then F is a homeomorphism of X 
onto X* (in the norm topologies). 



3. Non-commutative $L_p$-spaces.

Let $M$ be a von Neumann algebra and let $\phi$ be a faithful normal
semifinite weight. We denote by $N_\phi$ the set of $y \in M$ satisfying $\phi(y^*y) < \infty$
and by $M_0$ the set of all elements in $N_\phi \cap N_\phi^*$, entire analytic with
respect to the modular automorphism $\sigma_t^\phi$ associated with $\phi$. We also
denote the GNS map by $N_\phi \ni y \mapsto \eta_\phi(y) \in H_\phi$.

Let $1 \le p \le \infty$ and let $L_p(M, \phi)$ be the non-commutative $L_p$ space
with respect to $\phi$, as defined by Araki and Masuda in |H Ej. The
elements of $L_p(M, \phi)$ are closed operators acting on the Hilbert space
$H_\phi$ satisfying



for all $y \in M_0$, such that the $L_p$-norm

$$\|T\|_p = \Big\{ \sup_{x \in M_0,\ \|x\| \le 1} \big\|\, |T|^{p/2}\eta_\phi(x) \big\| \Big\}^{2/p}$$

is finite. Then $L_p(M, \phi)$ with the $L_p$-norm is a Banach space. Let
$1 < p < \infty$, then $L_p(M, \phi)$ is uniformly convex and uniformly strongly
differentiable. The dual space $L_p^*(M, \phi)$ is $L_q(M, \phi)$, with $1/p + 1/q = 1$,
where the duality is given by 

(1) $$\langle T, T'\rangle_\phi = \lim_{y \to 1}\,(T\eta_\phi(y), T'\eta_\phi(y)),$$

where $T \in L_p(M, \phi)$, $T' \in L_q(M, \phi)$. The limit is taken in the *-strong
topology with restriction $y \in M_0$, $\|y\| \le 1$.

Each $T \in L_p(M, \phi)$, $1 \le p < \infty$, has a unique polar decomposition
of the form

$$T = u\Delta_{\varphi,\phi}^{1/p},$$

where $\varphi \in M_*^+$, $u \in M$ is a partial isometry such that the support
projection $s(\varphi) = u^*u$ and $\Delta_{\varphi,\phi}$ is the relative modular operator, see
Appendix C in [4] for definition and basic properties. On the other
hand, each operator of this form is in $L_p(M, \phi)$. The positive cone
$L_p^+(M, \phi)$ is the set of positive operators in $L_p(M, \phi)$ and we have

$$L_p^+(M, \phi) = \{\Delta_{\varphi,\phi}^{1/p},\ \varphi \in M_*^+\}.$$

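The identity

(2) $$\varphi(au) = \langle u\Delta_{\varphi,\phi}, a^*\rangle_\phi$$

for $a \in M$ gives an isometric isomorphism of $M_*$ and $L_1(M, \phi)$.
Similarly, $L_2(M, \phi)$ is isomorphic to $H_\phi$ by

$$u\Delta_{\varphi,\phi}^{1/2} \mapsto u\xi_\varphi,$$

where $\xi_\varphi$ is the vector representative of $\varphi$ in the natural positive cone
in $H_\phi$.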

If $\psi$ is a different n.s.f. weight, then there is an isometric isomorphism
$\tau_p(\psi, \phi) : L_p(M, \phi) \to L_p(M, \psi)$ and

(3) $$\langle T, T'\rangle_\phi = \langle \tau_p(\psi,\phi)T, \tau_q(\psi,\phi)T'\rangle_\psi$$

holds for all $T \in L_p(M, \phi)$ and $T' \in L_q(M, \phi)$.

A bilinear form on $L_p(M, \phi) \times L_q(M, \phi)$ is defined by

$$[T, T']_\phi = \langle T, T'^*\rangle_\phi, \qquad T \in L_p(M,\phi),\ T' \in L_q(M,\phi).$$

If $T_k \in L_{p_k}(M, \phi)$, $\sum_k 1/p_k = 1/r$, then the product $T = T_1 \cdots T_n$ is well
defined as an element of $L_r(M, \phi)$ and

$$\|T\|_r \le \|T_1\|_{p_1} \cdots \|T_n\|_{p_n}.$$

If $r = 1$, then

(4) $$[T_1 \cdots T_n]_\phi := [T, 1]_\phi = [T_1 \cdots T_k, T_{k+1} \cdots T_n]_\phi = [T_{k+1} \cdots T_n T_1 \cdots T_k]_\phi$$

for each $1 \le k \le n - 1$ and

(5) $$|[T_1 \cdots T_n]_\phi| \le \|T_1\|_{p_1} \cdots \|T_n\|_{p_n}.$$

4. The $\alpha$-embeddings and affine connections

Let $M$ be a von Neumann algebra and let $\phi$ be a faithful normal
semifinite weight.

For $-1 < \alpha < 1$, we define the non-commutative $\alpha$-embedding by

$$\ell_\alpha^\phi : M_* \to L_p(M, \phi), \quad p = \frac{2}{1-\alpha}, \qquad \omega \mapsto p\,u\Delta_{\varphi,\phi}^{1/p},$$

where $\omega(a) = \varphi(au)$, $a \in M$, is the polar decomposition of $\omega$. It is
clear from uniqueness of the polar decompositions that $\ell_\alpha^\phi$ is bijective.
Moreover, it maps the hermitian (that is, $\omega(a^*) = \overline{\omega(a)}$) elements in
$M_*$ onto the real Banach space of self-adjoint operators in
the $L_p$-space and $M_*^+$ onto the positive cone $L_p^+(M, \phi)$.

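If $\psi$ is a different n.s.f. weight, then the space $L_p(M, \psi)$ is identified
with $L_p(M, \phi)$ by the isometric isomorphism $\tau_p(\psi, \phi)$. The
corresponding $\alpha$-embeddings are related by

$$\ell_\alpha^\psi = \tau_p(\psi, \phi)\,\ell_\alpha^\phi.$$

We denote by $\mathcal{M}_\alpha$ the set $M_*$ with the manifold structure induced
from $\ell_\alpha^\phi$. Due to the above isomorphism, the manifold structure does
not depend on the choice of $\phi$. For $\omega \in M_*$, $\ell_\alpha^\phi(\omega) \in L_p(M, \phi)$ will
be called the $\alpha$-coordinate of $\omega$. The $-\alpha$-coordinate is an element of
the dual space $L_q(M, \phi)$, $1/p + 1/q = 1$. Moreover, for $\omega_1, \omega_2 \in M_*$ and
a n.s.f. weight $\psi$, we have by (3)

(6) $$\langle \ell_\alpha^\phi(\omega_1), \ell_{-\alpha}^\phi(\omega_2)\rangle_\phi = \langle \tau_p(\psi,\phi)\ell_\alpha^\phi(\omega_1), \tau_q(\psi,\phi)\ell_{-\alpha}^\phi(\omega_2)\rangle_\psi = \langle \ell_\alpha^\psi(\omega_1), \ell_{-\alpha}^\psi(\omega_2)\rangle_\psi.$$

In the sequel, we will just write $\ell_\alpha$ instead of $\ell_\alpha^\phi$. We will say that $\ell_\alpha(\omega) \in L_p(M, \phi)$
and $\ell_{-\alpha}(\omega) \in L_q(M, \phi)$ are dual coordinates of $\omega \in M_*$.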

The trivial connection in $L_p(M, \phi)$ induces a globally flat affine
connection on the tangent bundle $T\mathcal{M}_\alpha$, called the $\alpha$-connection. Let us
recall that there is a one-to-one correspondence between affine
connections and parallel transports on $T\mathcal{M}_\alpha$. If the connection is
globally flat, the parallel transport is given by a family of isomorphisms
$U_{x,y} : T_x(\mathcal{M}_\alpha) \to T_y(\mathcal{M}_\alpha)$, $x, y \in \mathcal{M}_\alpha$, satisfying

(i) $U_{x,x} = \mathrm{Id}$,

(ii) $U_{y,z}U_{x,y} = U_{x,z}$.

In our case, the tangent space $T_x(\mathcal{M}_\alpha)$ can be identified with $L_p(M, \phi)$
and the map $U_{x,y}$ is the identity map for all $x, y \in \mathcal{M}_\alpha$. We define
the dual connection as in [7], that is, a linear connection on the
cotangent bundle $T^*\mathcal{M}_\alpha$, such that the corresponding parallel transport $U^*$
satisfies

$$\langle v, U^*_{x,y}(w)\rangle_\phi = \langle U_{y,x}(v), w\rangle_\phi = \langle v, w\rangle_\phi$$

for $w \in (T_x(\mathcal{M}_\alpha))^* = L_q(M, \phi)$ and $v \in T_y(\mathcal{M}_\alpha)$. Obviously, $U^*$ is
the trivial parallel transport in $L_q(M, \phi)$, hence the dual of the
$\alpha$-connection is the $-\alpha$-connection.



5. Duality. 

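Let $\omega \in M_*$. We will show how $\omega$ is related to its dual coordinates.

Proposition 5.1. Let $\omega \in M_*$, $\omega(a) = \varphi(au)$ be the polar
decomposition and let $\varphi_u(a) = \varphi(u^*au)$. Then

$$pq\,\varphi_u(a) = \langle \ell_\alpha(\omega), a^*\ell_{-\alpha}(\omega)\rangle_\phi, \qquad a \in M.$$

Proof. We have from (2) and (4) that

$$\varphi_u(a) = \langle \Delta_{\varphi,\phi}, u^*a^*u\rangle_\phi = [\Delta_{\varphi,\phi}u^*au]_\phi = [\Delta_{\varphi,\phi}^{1/p}\Delta_{\varphi,\phi}^{1/q}u^*au]_\phi = \frac{1}{pq}\langle \ell_\alpha(\omega), a^*\ell_{-\alpha}(\omega)\rangle_\phi.$$

□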

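The $L_p$ spaces for $1 < p < \infty$ are uniformly convex and uniformly
smooth, therefore we can use the results of Section 2.

The map which sends the $\alpha$-coordinate $x = \ell_\alpha(\omega)$ of $\omega$ onto the dual
coordinate,

$$x \mapsto \tilde{x} := \ell_{-\alpha}(\omega),$$

is called the duality map. It is easy to see that for $x \in L_p(M, \phi)$ we
have

(7) $$v_{x/\|x\|_p} = \Big\|\frac{x}{p}\Big\|_p^{1-p}\,\frac{\tilde{x}}{q}$$

and $\tilde{x}$ is the unique element in $L_q(M, \phi)$ such that

(8) $$\Big\|\frac{\tilde{x}}{q}\Big\|_q^q = \Big\|\frac{x}{p}\Big\|_p^p \quad\text{and}\quad \mathrm{Re}\langle x, \tilde{x}\rangle_\phi = pq\,\Big\|\frac{x}{p}\Big\|_p^p.$$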

Proposition 5.2. The duality map is a homeomorphism $L_p(M, \phi) \to L_q(M, \phi)$.

Proof. Clearly, $pu\Delta_{\varphi,\phi}^{1/p} \mapsto qu\Delta_{\varphi,\phi}^{1/q}$ is continuous at 0. Further, let $F$ be
the map defined in Section 2 and $x \ne 0$, then we have from (7)

$$\tilde{x} = \frac{pq}{p^p}\,\|x\|_p^{p-2}\,F(x).$$

The statement now follows from Theorem 2.2.

□

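Let us define the function $\Psi_p : L_p(M, \phi) \to \mathbb{R}^+$ by

$$\Psi_p(x) = q\Big\|\frac{x}{p}\Big\|_p^p = q\varphi(1),$$

where $x = pu\Delta_{\varphi,\phi}^{1/p}$. Then we have

Proposition 5.3. $\Psi_p$ is strongly differentiable. The strong derivative
at $x$ is given by

$$D_y\Psi_p(x) = \mathrm{Re}\langle y, \tilde{x}\rangle_\phi, \qquad y \in L_p(M, \phi),$$

where $\tilde{x}$ is the dual coordinate. If $1/p + 1/q = 1$, then

$$\Psi_q(\tilde{x}) = \mathrm{Re}\langle x, \tilde{x}\rangle_\phi - \Psi_p(x).$$

Proof. We have from the uniform smoothness of $L_p(M, \phi)$ that the
norm is strongly differentiable at all points except $x = 0$ and

$$D_y\|x\|_p = \mathrm{Re}\langle y, v_{x/\|x\|_p}\rangle_\phi.$$

It follows from (7) that for $x \ne 0$,

$$D_y\Psi_p(x) = q\Big\|\frac{x}{p}\Big\|_p^{p-1}\mathrm{Re}\langle y, v_{x/\|x\|_p}\rangle_\phi = \mathrm{Re}\langle y, \tilde{x}\rangle_\phi.$$

As $p > 1$, the function $\|\cdot\|_p^p$ is strongly differentiable at $x = 0$ and

$$D_y\Psi_p(0) = 0 = \mathrm{Re}\langle y, \tilde{0}\rangle_\phi.$$

The last equality is rather obvious.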

□ 
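The Legendre relation and the derivative formula of Proposition 5.3 can be checked numerically in the simplest setting. The following is only a sketch under a strong assumption that is not part of the text: $M$ is taken to be the commutative algebra of functions on a finite set with the counting measure as the weight, so that $L_p$ is $\mathbb{R}^n$ with the usual $p$-norm, the pairing is $\sum_i x_i y_i$, and the duality map is $\tilde{x}_i = q\,\mathrm{sign}(x_i)|x_i/p|^{p-1}$. The function names are hypothetical.

```python
import math

def dual(x, p):
    """Duality map in the commutative case (assumption): x~_i = q*sign(x_i)*|x_i/p|**(p-1)."""
    q = p / (p - 1.0)
    return [q * math.copysign(abs(xi / p) ** (p - 1.0), xi) for xi in x]

def potential(x, p):
    """Psi_p(x) = q * ||x/p||_p^p for the counting measure."""
    q = p / (p - 1.0)
    return q * sum(abs(xi / p) ** p for xi in x)

def pairing(x, y):
    """Real dual pairing <x, y> = sum_i x_i * y_i."""
    return sum(a * b for a, b in zip(x, y))

p = 3.0
q = p / (p - 1.0)
x = [1.5, -0.7, 2.0]
x_dual = dual(x, p)

# Legendre relation: Psi_q(x~) = <x, x~> - Psi_p(x)
legendre_gap = potential(x_dual, q) - (pairing(x, x_dual) - potential(x, p))

# Directional derivative of Psi_p at x along y, by central differences,
# compared with the claimed value <y, x~>
y = [0.3, 1.1, -0.4]
h = 1e-6
x_plus = [a + h * b for a, b in zip(x, y)]
x_minus = [a - h * b for a, b in zip(x, y)]
numeric = (potential(x_plus, p) - potential(x_minus, p)) / (2 * h)
deriv_gap = numeric - pairing(y, x_dual)
```

Both `legendre_gap` and `deriv_gap` should vanish up to rounding and discretization error; the non-commutative statement is of course not tested by this.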

In the commutative case, as well as on the manifold of positive
definite $n \times n$ matrices, $\Psi_p$ is the potential function in the sense of Amari,
see [T] and [TS1 [TO]. In general, it is not twice differentiable, but the
above Proposition shows that the Legendre transformations, relating
the dual coordinate systems, are still valid. It will also be clear from
the results of the next Section that

$$\Psi_q(\tilde{x}) = \sup_{y \in L_p(M,\phi)} \big(\mathrm{Re}\langle y, \tilde{x}\rangle_\phi - \Psi_p(y)\big),$$

hence $\Psi_q$ is the conjugate of the convex function $\Psi_p$.

6. Divergence in $L_p(M, \phi)$.

Following [1], the function $D_p : L_p(M, \phi) \times L_p(M, \phi) \to \mathbb{R}^+$, defined
by

$$D_p(x, y) = \Psi_p(x) + \Psi_q(\tilde{y}) - \mathrm{Re}\langle x, \tilde{y}\rangle_\phi,$$

is called the divergence. It has the following properties.

Proposition 6.1. (i) Let $f_p(t) = p + qt^p - pqt$. Then

(9) $$D_p(x, y) \ge \Big\|\frac{y}{p}\Big\|_p^p\, f_p\Big(\frac{\|x\|_p}{\|y\|_p}\Big)$$

for all $x, y \in L_p(M, \phi)$, where for $y = 0$ we take the limit
$\lim_{t \to 0} t^p f_p(s/t) = qs^p$ for all $s$. In particular, $D_p(x, y) \ge 0$ for all
$x, y \in L_p(M, \phi)$ and equality is attained if and only if $x = y$.

(ii) $D_p$ is jointly continuous and strongly differentiable in the first
variable.

(iii) $D_p(y, x) = D_q(\tilde{x}, \tilde{y})$

(iv) $D_p(x, y) + D_p(y, z) = D_p(x, z) + \mathrm{Re}\langle x - y, \tilde{z} - \tilde{y}\rangle_\phi$

Proof. The statement (ii) follows from Proposition 5.3; (iii) and (iv)
follow easily from the definition of $D_p$. We will now prove (i). If $y = 0$,
then $D_p(x, y) = \Psi_p(x) \ge 0$. Similarly, if $x = 0$, $D_p(x, y) = \Psi_q(\tilde{y})$,
which is equal to the right hand side of (9).

Let now $x \ne 0$, $y \ne 0$ and let $t = \|x\|_p/\|y\|_p$. Then by (7)

$$\mathrm{Re}\langle x, \tilde{y}\rangle_\phi = tq\Big\|\frac{y}{p}\Big\|_p^{p-1}\,\mathrm{Re}\Big\langle \frac{x}{t}, v_{y/\|y\|_p}\Big\rangle_\phi.$$

Let $\|y\|_p = r$ and let $S_r$ be the sphere with radius $r$ in $L_p(M, \phi)$. Then
$y, \frac{x}{t} \in S_r$. From Section 2, the tangent hyperplane $y + H$ to $S_r$ at $y$
is given by $\mathrm{Re}\langle z, v_{y/r}\rangle_\phi = r$, $S_r$ lies entirely in the half-space given by
$\mathrm{Re}\langle z, v_{y/r}\rangle_\phi \le r$ and $y$ is the unique point of $S_r$ contained in $y + H$.
Hence,

$$D_p(x, y) \ge \Psi_p(x) + \Psi_q(\tilde{y}) - tpq\Big\|\frac{y}{p}\Big\|_p^p = \Big\|\frac{y}{p}\Big\|_p^p f_p(t) \ge 0,$$

where equality is attained in the first inequality if and only if $\frac{x}{t} = y$,
and in the second inequality if and only if $t = 1$. □
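As an illustration of Proposition 6.1, the lower bound (9) and the four-point identity (iv) can be tested on random vectors. This is a sketch under the same hypothetical commutative model as before (an assumption, not the general von Neumann algebra setting): $L_p = \mathbb{R}^n$ with the counting measure and $\tilde{x}_i = q\,\mathrm{sign}(x_i)|x_i/p|^{p-1}$. All helper names are illustrative.

```python
import math
import random

def dual(x, p):
    """Commutative duality map (assumption): x~_i = q*sign(x_i)*|x_i/p|**(p-1)."""
    q = p / (p - 1.0)
    return [q * math.copysign(abs(xi / p) ** (p - 1.0), xi) for xi in x]

def potential(x, p):
    """Psi_p(x) = q * ||x/p||_p^p."""
    q = p / (p - 1.0)
    return q * sum(abs(xi / p) ** p for xi in x)

def pairing(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def divergence(x, y, p):
    """D_p(x, y) = Psi_p(x) + Psi_q(y~) - <x, y~>."""
    q = p / (p - 1.0)
    y_dual = dual(y, p)
    return potential(x, p) + potential(y_dual, q) - pairing(x, y_dual)

def f(t, p):
    """f_p(t) = p + q t^p - p q t."""
    q = p / (p - 1.0)
    return p + q * t ** p - p * q * t

random.seed(0)
p = 2.5
min_slack = float("inf")   # smallest value of D_p(x, y) minus the bound in (9)
max_residual = 0.0         # largest deviation from the identity (iv)
for _ in range(1000):
    x, y, z = ([random.uniform(-2.0, 2.0) for _ in range(4)] for _ in range(3))
    bound = norm([yi / p for yi in y], p) ** p * f(norm(x, p) / norm(y, p), p)
    min_slack = min(min_slack, divergence(x, y, p) - bound)
    lhs = divergence(x, y, p) + divergence(y, z, p)
    x_minus_y = [a - b for a, b in zip(x, y)]
    dual_diff = [a - b for a, b in zip(dual(z, p), dual(y, p))]
    rhs = divergence(x, z, p) + pairing(x_minus_y, dual_diff)
    max_residual = max(max_residual, abs(lhs - rhs))
```

In this model `min_slack` should be nonnegative and `max_residual` should be zero up to rounding.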

We will also need the following lemma. 

Lemma 6.1. Let $y \in L_p(M, \phi)$, $d > 0$ and let

$$U_{y,d} := \{x \in L_p(M, \phi),\ D_p(x, y) \le d\}.$$

Then $U_{y,d}$ is weakly closed, convex and contains no half-line.

Proof. It is easy to see that $D_p$ is convex in the first variable, therefore
the set $U_{y,d}$ is also convex. Next, let $\{x_\lambda\}$ be a net in $U_{y,d}$
converging weakly to some $x \in L_p(M, \phi)$ (it is in fact sufficient to consider
sequences). Then $0 \le D_p(x_\lambda, y) \le d$ and we may suppose that the net
$d_\lambda = D_p(x_\lambda, y)$ has a limit in $[0, d]$, using a subnet if necessary. We
have

$$\lim_\lambda d_\lambda = \Psi_q(\tilde{y}) + \lim_\lambda \Big\{q\Big\|\frac{x_\lambda}{p}\Big\|_p^p - \mathrm{Re}\langle x_\lambda, \tilde{y}\rangle_\phi\Big\}.$$

It follows that $\lim_\lambda \|x_\lambda\|_p$ exists. Furthermore, for $u$ in the unit sphere
of $L_q(M, \phi)$,

$$|\langle x, u\rangle_\phi| = \lim_\lambda |\langle x_\lambda, u\rangle_\phi| \le \lim_\lambda \|x_\lambda\|_p$$

and hence $\|x\|_p \le \lim_\lambda \|x_\lambda\|_p$. We therefore have

$$D_p(x, y) = \Psi_q(\tilde{y}) + q\Big\|\frac{x}{p}\Big\|_p^p - \mathrm{Re}\langle x, \tilde{y}\rangle_\phi \le \lim_\lambda d_\lambda \le d$$

and $U_{y,d}$ is weakly closed.

Finally, let $h \ne 0$ and let $x_t = x + th$, $t \ge 0$, be a half-line in
$L_p(M, \phi)$. For $y = 0$, we have $D_p(x_t, 0) = q\|\frac{x_t}{p}\|_p^p$. If $y \ne 0$, then by
Proposition 6.1 (i),

$$D_p(x_t, y) \ge \Big\|\frac{y}{p}\Big\|_p^p f_p\Big(\frac{\|x_t\|_p}{\|y\|_p}\Big).$$

In both cases, the right-hand side goes to infinity as $t \to \infty$. Therefore
$U_{y,d}$ can contain no half-line.

□

7. $D_p$-projections.

Let $C$ be a subset in $L_p(M, \phi)$, $y \in L_p(M, \phi)$. If there is a point
$x_m \in C$ such that

$$D_p(x_m, y) = \min_{x \in C} D_p(x, y),$$

then $x_m$ will be called a $D_p$-projection of $y$ to $C$. In this section, we
prove some uniqueness and existence results for $D_p$-projections.

Proposition 7.1. Let $C$ be a convex subset in $L_p(M, \phi)$, $y \in L_p(M, \phi)$
and $x_m \in C$. The following are equivalent.

(i) $D_p(x_m, y) = \min_{x \in C} D_p(x, y)$

(ii) $\tilde{y} - \tilde{x}_m$ is in the normal cone to $C$ at $x_m$, that is,

$$\mathrm{Re}\langle x - x_m, \tilde{y} - \tilde{x}_m\rangle_\phi \le 0, \qquad \forall x \in C$$

(iii) $D_p(x, y) \ge D_p(x, x_m) + D_p(x_m, y)$, $\forall x \in C$

If such a point exists, it is unique.



Proof. Let $x_m$ be a point in $C$ satisfying (i) and let $x \in C$. Then $x_t =
tx + (1 - t)x_m$ lies in $C$ for all $t \in [0, 1]$ and thus $D_p(x_t, y) \ge D_p(x_m, y)$
on $[0, 1]$. We have from Proposition 5.3

$$0 \le \frac{d}{dt}D_p(x_t, y)\Big|_{t=0} = \mathrm{Re}\langle x - x_m, \tilde{x}_m - \tilde{y}\rangle_\phi,$$

which is (ii). Further, from Proposition 6.1 (iv),

$$\mathrm{Re}\langle x - x_m, \tilde{x}_m - \tilde{y}\rangle_\phi = D_p(x, y) - D_p(x, x_m) - D_p(x_m, y),$$

hence (ii) implies (iii). Finally, let $x_m$ satisfy (iii), then we clearly have
$D_p(x_m, y) \le D_p(x, y)$ for all $x \in C$.

To prove uniqueness, suppose that $x_1$ and $x_2$ are points in $C$
satisfying (iii). Then

$$D_p(x_1, y) \ge D_p(x_1, x_2) + D_p(x_2, y) \ge D_p(x_1, x_2) + D_p(x_2, x_1) + D_p(x_1, y).$$

It follows that $D_p(x_1, x_2) + D_p(x_2, x_1) \le 0$ and hence $x_1 = x_2$. □
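The equivalence in Proposition 7.1 can be made concrete for a very simple convex set. The sketch below again assumes the commutative model $L_p = \mathbb{R}^n$ (counting measure), which is an assumption for illustration only, and takes $C$ to be a line segment. Since $t \mapsto D_p(a + t(b - a), y)$ is convex with derivative $\langle b - a, \tilde{x}_t - \tilde{y}\rangle$, the projection is found by bisection on the sign of that derivative; this solver is an illustrative choice, not a method from the text.

```python
import math

def dual(x, p):
    """Commutative duality map (assumption): x~_i = q*sign(x_i)*|x_i/p|**(p-1)."""
    q = p / (p - 1.0)
    return [q * math.copysign(abs(xi / p) ** (p - 1.0), xi) for xi in x]

def potential(x, p):
    q = p / (p - 1.0)
    return q * sum(abs(xi / p) ** p for xi in x)

def divergence(x, y, p):
    """D_p(x, y) = Psi_p(x) + Psi_q(y~) - <x, y~>."""
    q = p / (p - 1.0)
    y_dual = dual(y, p)
    return potential(x, p) + potential(y_dual, q) - sum(a * b for a, b in zip(x, y_dual))

def project_on_segment(a, b, y, p, iters=200):
    """Minimise t -> D_p(a + t(b - a), y) over [0, 1] by bisection on its derivative."""
    d = [bi - ai for ai, bi in zip(a, b)]
    y_dual = dual(y, p)

    def slope(t):
        x_dual = dual([ai + t * di for ai, di in zip(a, d)], p)
        return sum(di * (u - v) for di, u, v in zip(d, x_dual, y_dual))

    lo, hi = 0.0, 1.0
    if slope(lo) >= 0.0:
        t = lo                      # minimum at the endpoint a
    elif slope(hi) <= 0.0:
        t = hi                      # minimum at the endpoint b
    else:
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if slope(mid) > 0.0:
                hi = mid
            else:
                lo = mid
        t = 0.5 * (lo + hi)
    return t, [ai + t * di for ai, di in zip(a, d)]

p = 3.0
a = [0.5, 2.0, -1.0]
b = [3.0, -0.5, 1.0]
y = [1.0, 1.0, 1.0]
t_min, x_m = project_on_segment(a, b, y, p)

# Property (iii): D_p(x, y) - D_p(x, x_m) - D_p(x_m, y) >= 0 for x on a grid in C
grid = [[ai + (s / 50.0) * (bi - ai) for ai, bi in zip(a, b)] for s in range(51)]
worst_slack = min(divergence(x, y, p) - divergence(x, x_m, p) - divergence(x_m, y, p)
                  for x in grid)
best_on_grid = min(divergence(x, y, p) for x in grid)
```

Here `worst_slack` should be nonnegative up to rounding, and `divergence(x_m, y, p)` should not exceed `best_on_grid`.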

Proposition 7.2. Let $C$ be a weakly compact subset in $L_p(M, \phi)$ and
$y \in L_p(M, \phi)$. Then there exists a $D_p$-projection of $y$ to $C$.

Proof. For some $d > 0$, the set $U_{y,d}$ has a nonempty intersection with
$C$. By Lemma 6.1, the sets $U_{y,d} \cap C$ are weakly closed. The intersection
of these sets for all such $d$ is therefore nonempty and is equal to some
$U_{y,\rho} \cap C$. Then $\rho = \min_{x \in C} D_p(x, y)$ and all the points in $U_{y,\rho} \cap C$ are
$D_p$-projections of $y$ in $C$. □

Proposition 7.3. Let $C$ be a weakly closed, convex, weakly locally
compact subset in $L_p(M, \phi)$. Then for each $y \in L_p(M, \phi)$ there is a
unique $D_p$-projection to $C$.

Proof. Similarly as in the proof of the previous Proposition, the set $U_{y,d} \cap C$
is non-empty for sufficiently large $d > 0$. By Lemma 6.1 this set is
convex and weakly closed. As $C$ is weakly locally compact, $U_{y,d} \cap C$ is
also weakly locally compact. By [T3], pp. 340, a closed convex locally
compact subset in a locally convex space is compact if and only if it
contains no half-line. It follows that the sets $U_{y,d} \cap C$ are weakly compact and
the intersection of all such nonempty sets is therefore nonempty. Each
point in this intersection is a $D_p$-projection of $y$ to $C$. By Proposition
7.1 such a point is unique. □

Under the hypotheses of the above Proposition, we can define the
map $y \mapsto x_m$, which sends each point $y$ to its unique $D_p$-projection in
$C$.




Proposition 7.4. Let $C$ be a weakly closed convex weakly locally
compact subset in $L_p(M, \phi)$ and let $0 \in C$. Then the $D_p$-projection is
continuous from $L_p(M, \phi)$ with its norm topology to $C$ with the relative
weak topology.

Proof. Let $\{y_n\}$ be a sequence in $L_p(M, \phi)$ converging in norm to $y$.
Let $x_m^n$ be the unique $D_p$-projection of $y_n$ and $x_m$ be the unique
$D_p$-projection of $y$ in $C$ from Proposition 7.3. We have to prove that $x_m^n$
converges weakly to $x_m$.

Let $k > 0$ be such that $\|y_n\|_p < k$ for all $n$. Inserting $x = 0$ in
Proposition 7.1 (iii), we get

$$D_p(0, y) \ge D_p(0, x_m) + D_p(x_m, y)$$

and therefore by (9), $\|x_m\|_p \le \|y\|_p \le k$. Similarly, $\|x_m^n\|_p \le \|y_n\|_p < k$
for each $n$.

As the duality map is continuous, we have $\tilde{y}_n \to \tilde{y}$ in $L_q(M, \phi)$.
Further, we have from joint continuity of $D_p$ that $\lim D_p(y, y_n) =
\lim D_p(y_n, y) = D_p(y, y) = 0$. For sufficiently large $n$,

$$\begin{aligned}
d_n := D_p(x_m^n, y_n) &= \inf_{x \in C,\ \|x\|_p \le k} D_p(x, y_n) \\
&= \inf_{x \in C,\ \|x\|_p \le k} \{D_p(x, y) + D_p(y, y_n) - \mathrm{Re}\langle x - y, \tilde{y}_n - \tilde{y}\rangle_\phi\} \\
&\le D_p(x_m, y) + D_p(y, y_n) + 2k\|\tilde{y} - \tilde{y}_n\|_q \le d + \varepsilon,
\end{aligned}$$

where $d := D_p(x_m, y)$. Further,

$$\begin{aligned}
D_p(x_m^n, y) &= D_p(x_m^n, y_n) + D_p(y_n, y) - \mathrm{Re}\langle x_m^n - y_n, \tilde{y} - \tilde{y}_n\rangle_\phi \\
&\le d_n + D_p(y_n, y) + 2k\|\tilde{y} - \tilde{y}_n\|_q \le d + 2\varepsilon.
\end{aligned}$$

Hence for sufficiently large $n$, $x_m^n \in U_{y,d+2\varepsilon} \cap C$. As in the proof of
Proposition 7.3, these sets are nonempty weakly compact sets and
therefore $\{x_m^n\}$ contains a weakly convergent subsequence. On the
other hand, any limit of such a subsequence has to be in $U_{y,d+2\varepsilon} \cap C$
for all $\varepsilon$ and thus also in $\bigcap_\varepsilon U_{y,d+2\varepsilon} \cap C$. This intersection contains a
single point $x_m$; it follows that $x_m^n$ converges weakly to $x_m$. □

8. The $\alpha$-divergence in $M_*^+$

Let $\alpha \in (-1, 1)$ and let $p = \frac{2}{1-\alpha}$. The divergence in $L_p(M, \phi)$ defines
the functional $S_\alpha : M_* \times M_* \to \mathbb{R}^+$ by

$$S_\alpha(\omega_1, \omega_2) := D_p(\ell_\alpha(\omega_1), \ell_\alpha(\omega_2)) = q\varphi(1) + p\psi(1) - pq\,\mathrm{Re}\langle u\Delta_{\varphi,\phi}^{1/p}, v\Delta_{\psi,\phi}^{1/q}\rangle_\phi,$$

where $\omega_1(a) = \varphi(au)$ and $\omega_2(a) = \psi(av)$ are the polar decompositions.
It is called the $\alpha$-divergence. It follows from (6) that $S_\alpha$ does not
depend on $\phi$. In particular, if $\psi$ is faithful, then

where $\xi_\psi$ is a vector representative of $\psi$. It follows that if $\varphi, \psi \in M_*^+$,
$\psi$ is faithful and $\Delta_{\varphi,\psi} = \int \lambda\, dE_\lambda$ is the spectral decomposition, then

$$S_\alpha(\varphi, \psi) = (g_p(\Delta_{\varphi,\psi})\xi_\psi, \xi_\psi) = \int g_p(\lambda)\, d\|E_\lambda\xi_\psi\|^2,$$

where $g_p(t) = p + qt - pqt^{1/p}$. Hence, in this case the $\alpha$-divergence is
equal to the quasi entropy $S_{g_p}$, defined by Petz in [THl HE]. We will
show that this is true on the whole of $M_*^+ \times M_*^+$.
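To make the identification with the quasi-entropy tangible, here is a small numerical sketch in the commutative case, which is an assumption made only for illustration: positive functionals are vectors with positive entries, the relative modular operator acts as multiplication by $\varphi_i/\psi_i$, and the pairing reduces to a sum, so that $S_\alpha(\varphi,\psi) = q\sum_i\varphi_i + p\sum_i\psi_i - pq\sum_i\varphi_i^{1/p}\psi_i^{1/q}$. The function names are hypothetical. The sketch also checks that $\alpha = 0$ gives $2\sum_i(\sqrt{\varphi_i} - \sqrt{\psi_i})^2$, in line with the Hellinger-type expression for the 0-divergence.

```python
import math

def alpha_divergence(phi, psi, alpha):
    """Commutative S_alpha: q*sum(phi) + p*sum(psi) - p*q*sum(phi^(1/p) * psi^(1/q))."""
    p = 2.0 / (1.0 - alpha)
    q = 2.0 / (1.0 + alpha)
    cross = sum(a ** (1.0 / p) * b ** (1.0 / q) for a, b in zip(phi, psi))
    return q * sum(phi) + p * sum(psi) - p * q * cross

def quasi_entropy(phi, psi, alpha):
    """sum_i psi_i * g_p(phi_i / psi_i), g_p(t) = p + q t - p q t^(1/p); needs psi_i > 0."""
    p = 2.0 / (1.0 - alpha)
    q = 2.0 / (1.0 + alpha)

    def g(t):
        return p + q * t - p * q * t ** (1.0 / p)

    return sum(b * g(a / b) for a, b in zip(phi, psi))

phi = [0.2, 1.3, 0.5]
psi = [0.7, 0.4, 0.9]

# The two expressions agree for a generic alpha
gap = abs(alpha_divergence(phi, psi, 0.4) - quasi_entropy(phi, psi, 0.4))

# alpha = 0: S_0 = 2 * sum (sqrt(phi_i) - sqrt(psi_i))^2
hellinger = 2.0 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(phi, psi))
gap0 = abs(alpha_divergence(phi, psi, 0.0) - hellinger)
```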

Lemma 8.1. Let $\varphi, \psi \in M_*^+$; $u, v \in M$ be partial isometries satisfying
$u^*u = s(\varphi)$, $v^*v = s(\psi)$. Let $p, q > 1$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. Then

(10) «Aj;;>^ = ( A^% Alf*u*v^ ) 
where $\xi_\psi$ is a vector representative of $\psi$.

Proof. Let 1/p < 1/2. We have 

, = lim( AjJ-^VuA^Cy) , A^(y) ), 

with ?/ G M , ||y|| < 1. For y G A^,, 

(11) ( aJ3t 1/ vua^(i/) , A ;%( y ) ) = 

= ( j^vAT^dv) > j ^ A i(r i/Pu * uA i{W^) ) = 

here we have used that J| ^.J^,^ = = ^(A^), the support of 
A^ 5 0. Let t G M, then 

^ A ^r VwA ^(^) = = 

S ^m v * ^l^L^^iv) = S^v*(Dip v : D<p u )_ t ur)4(y) = 
y*u*(Di) v : Dcp u )*_ t v^ 

where y u (a) = (p(u*au) and uA^^u* = A^, by (C.8) in 0]. From 
this, we have 

( V*U > J^vATi^^t^iv) ) = ( : Dip u )_ t uyy*^ , ) = 

= ( A^uyy'fc , ^ ) = ( uA^yy'fc , «fc ), 



where we have used (C.5) and (C.8) of [4 . It follows that for z = it, 

(12) ( y*^ , J^A^J" VmA^^(?/) ) = (y% , y*A z ^u*v£ f ). 

By Lemma 3.1 in [Uj, both sides of (12) are holomorphic for $0 < \mathrm{Re}\, z <
1/2$ and continuous for $0 \le \mathrm{Re}\, z \le 1/2$. The equation (10) holds for
$1/p < 1/2$ by (11) and analytic continuation of (12).

Let now 1/q < 1/2. We have by the first part of the proof 

, «Aj;;>^ = ) = ( S^tiX* , A^S^fc ) 

we have used the equations (C.14) JL„ 2 = J^-ni anc ^ (/^) ^m^^vi^m^m, 
A" 1 from Appendix C in jl] . □ 

It follows that $S_\alpha(\varphi, \psi) = S_{g_p}(\varphi, \psi)$ for all positive normal
functionals $\varphi$ and $\psi$. The function $g_p$, $1 < p < \infty$, is operator convex and it
follows from the results in fIF that

(i) $S_\alpha$ is jointly convex on $M_*^+ \times M_*^+$

(ii) $S_\alpha$ decreases under stochastic maps on $M_*^+ \times M_*^+$

(iii) $S_\alpha$ is lower semicontinuous on $M_*^+ \times \mathcal{F}(M_*^+)$ endowed with the
product of norm topologies, where $\mathcal{F}(M_*^+)$ denotes the set of
faithful elements in $M_*^+$.

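The following properties of the $\alpha$-divergence are valid on $M_* \times M_*$
and are immediate consequences of the results of Section 6.

(i) Positivity

$$S_\alpha(\varphi, \psi) \ge \|\psi\|_1\, f_p\Big(\Big(\frac{\|\varphi\|_1}{\|\psi\|_1}\Big)^{1/p}\Big) \ge 0$$

and $S_\alpha(\varphi, \psi) = 0$ if and only if $\varphi = \psi$ (here $\|\cdot\|_1$ is the norm
in $M_*$).

(ii) $S_\alpha(\varphi, \psi) = S_{-\alpha}(\psi, \varphi)$

(iii) generalized Pythagorean relation

$$S_\alpha(\varphi, \psi) + S_\alpha(\psi, \sigma) = S_\alpha(\varphi, \sigma) + \mathrm{Re}\langle \ell_\alpha(\varphi) - \ell_\alpha(\psi), \ell_{-\alpha}(\sigma) - \ell_{-\alpha}(\psi)\rangle_\phi$$

Notice that the Pythagorean relation (iii) is a generalization of the
classical version in pQ, which says that equality is attained if and only
if the $\alpha$-geodesic connecting $\psi$ and $\varphi$ is orthogonal to the $-\alpha$-geodesic
connecting $\psi$ and $\sigma$.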

We also define the $\alpha$-projection of $\psi \in M_*$ onto a subset $C \subset M_*$ as
the element in $C$ that minimizes $S_\alpha(\cdot, \psi)$ over $C$. We will say that a
subset $C \subset M_*$ is $\alpha$-convex if $\ell_\alpha(C)$ is convex. The next Proposition
is a generalization of the results in P3 |2] and follows directly from
Proposition 7.1.

Proposition 8.1. Let $C \subset M_*$ be $\alpha$-convex and let $\psi \in M_*$, $\varphi_m \in C$.

The following are equivalent.

(i) $\varphi_m$ is an $\alpha$-projection of $\psi$ in $C$.

(ii) For all $\sigma \in C$,

$$S_\alpha(\sigma, \psi) \ge S_\alpha(\varphi_m, \psi) + S_{-\alpha}(\varphi_m, \sigma)$$

(iii) The curve $x_t \in L_q(M, \phi)$,

lies in the normal cone to $\ell_\alpha(C)$ at $\ell_\alpha(\varphi_m)$ for all $t \ge 0$ (Note
that $\ell_{-\alpha}^{-1}(x_t)$ is the $-\alpha$-geodesic connecting $\varphi_m$ and $\psi$.)

If such a point exists, it is unique.

The topology induced by the $\alpha$-embedding from the norm, resp. the
weak topology in $L_p(M, \phi)$ will be called the $\alpha$-, resp. the $\alpha$-weak
topology. The following Proposition is also immediate from Section 7.

Proposition 8.2. Let $C \subset M_*$ and let $\psi \in M_*$.

(i) If $C$ is $\alpha$-weakly compact, then there exists an $\alpha$-projection of
$\psi$ in $C$.

(ii) If $C$ is $\alpha$-weakly closed, $\alpha$-convex, $\alpha$-weakly locally compact,
then there exists a unique $\alpha$-projection of $\psi$ in $C$.

(iii) If $C$ is as in (ii) and, moreover, $0 \in C$, then the $\alpha$-projection
is a continuous map from $M_*$ with the $\alpha$-topology to $C$ with the
relative $\alpha$-weak topology.

Example 8.1. Let $C$ be an extended $\alpha$-family, generated by a finite
number of positive elements, that is, there exist $x_1, \dots, x_n \in L_p(M, \phi)$,
such that

$$\ell_\alpha(C) = \Big\{\sum_{i=1}^n \lambda_i x_i,\ \lambda_i \ge 0,\ i = 1, \dots, n\Big\}.$$

It follows from Proposition 8.2 (iii) that we have an $\alpha$-projection from
$M_*$ to $C$, which is continuous in the $\alpha$-topology.

9. The case $\alpha = 0$.

Let $\alpha = 0$, $p = q = 2$. The space $L_2(M, \phi)$ can be identified with the
Hilbert space $H_\phi$ and the dual pairing $\langle\cdot,\cdot\rangle_\phi$ is the inner product $(\cdot,\cdot)$
in $H_\phi$. Through this identification, the 0-embedding becomes the map

$$\omega \mapsto 2u\xi_\varphi,$$

where $\omega(a) = \varphi(au)$ is the polar decomposition of $\omega$ and $\xi_\varphi$ is the
unique vector representative of $\varphi$ in the natural positive cone $V$ in $H_\phi$.
Hence the 0-embedding maps $M_*$ bijectively onto $H_\phi$. In this case, the
duality map is the identity on $H_\phi$ and the potential function is

$$\Psi_2(x) = \frac{1}{2}\|x\|^2.$$

Therefore, the potential function is $C^\infty$-differentiable and

$$D^2_{y,z}\Psi_2(x) = \mathrm{Re}(y, z), \qquad \forall x \in H_\phi.$$

It follows that $\Psi_2$ defines a Riemannian metric in the tangent bundle
$T\mathcal{M}_0$, which corresponds to the real part of the inner product, induced
from the 0-embedding. In the matrix case, this metric was studied on
density matrices and it was shown that it coincides with the
Wigner-Yanase metric, see 0.

Up to multiplication by 2, the restriction of $\ell_0$ to the positive cone
corresponds to the identification of the positive normal functionals
with elements in $V$ proved by Araki in [3]. It has also been shown that
this identification is a homeomorphism $M_*^+ \to V$. It follows that the
relative 0-topology is the same as the relative $L_1$-topology in $M_*^+$.

The $D_2$-divergence in $H_\phi$ is

$$D_2(x, y) = \frac{1}{2}\|x - y\|^2,$$

hence the $D_2$-projection corresponds to minimizing the Hilbert space
norm. This means, in particular, that there is a unique $D_2$-projection
onto every closed convex subset of $H_\phi$.

The 0-divergence in $M_*$ becomes

$$S_0(\omega_1, \omega_2) = 2\|u\xi_\varphi - v\xi_\psi\|^2.$$

On the positive cone, the 0-divergence generalizes the classical Hellinger
distance.

10. Topologies induced in $M_*^+$

In this section, we study various topologies induced by the $\alpha$-embeddings
in $M_*^+$. First of all, we see from Proposition 5.2 that the $+\alpha$- and
$-\alpha$-topologies are the same. Let now $\varphi, \psi \in M_*^+$ and let $\ell_\alpha(\varphi) = x$,
$\ell_\alpha(\psi) = y$. By Proposition 5.1 and (5), we have for $a \in M$,

$$\begin{aligned}
|\varphi(a) - \psi(a)| &= \frac{1}{pq}\,|\langle x, a^*\tilde{x}\rangle_\phi - \langle y, a^*\tilde{y}\rangle_\phi| \\
&= \frac{1}{2pq}\,|\langle x + y, a^*(\tilde{x} - \tilde{y})\rangle_\phi + \langle x - y, a^*(\tilde{x} + \tilde{y})\rangle_\phi| \\
&\le \frac{\|a\|}{2pq}\big(\|x + y\|_p\|\tilde{x} - \tilde{y}\|_q + \|x - y\|_p\|\tilde{x} + \tilde{y}\|_q\big).
\end{aligned}$$

It follows that the map $\ell_\alpha^{-1} : L_p^+(M, \phi) \to M_*^+$ is continuous relative
to the norm topologies. Hence the $\alpha$-topology is stronger than the
$L_1$-topology in $M_*^+$.

Since the $\alpha$-divergences can be seen as quasi-distances in $M_*^+$, we
will consider the topology induced by $S_\alpha$, which will be called the
$S_\alpha$-topology. The $S_\alpha$-topology is given by the base of neighborhoods

$$O_\alpha(\psi, \varepsilon) := \{\varphi \in M_*^+,\ S_\alpha(\varphi, \psi) < \varepsilon\}$$

for $\psi \in M_*^+$, $\varepsilon > 0$. Because the functions $L_p(M, \phi) \ni x \mapsto D_p(x, y) \in
\mathbb{R}^+$ are continuous for each $y$, the $S_\alpha$-topology is weaker than the
$\alpha$-topology.

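Lemma 10.1. Let $\varphi, \psi \in M_*^+$ and let $-1 < \alpha \le \beta < 1$. Then

$$(1 - \beta)S_\beta(\varphi, \psi) \le (1 - \alpha)S_\alpha(\varphi, \psi)$$
$$(1 + \alpha)S_\alpha(\varphi, \psi) \le (1 + \beta)S_\beta(\varphi, \psi)$$

Proof. The proof is essentially the same as in the classical case, see for
example [TB].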

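Let us consider the function

$$F_t(a) = t^a - at + a - 1, \qquad a \in (0, 1).$$

Then $F_t$ is convex on $(0, 1)$ for all $t > 0$. It follows that

$$\frac{F_t(1) - F_t(a)}{1 - a} \le \frac{F_t(1) - F_t(b)}{1 - b}$$

for all $0 < a < b < 1$ and $t > 0$. As $F_t(1) = 0$ for all $t$, we get that the
function $a \mapsto -F_t(a)/(1 - a)$ is increasing on $(0, 1)$. Let now $p > 1$ and put $a = 1/p$,
then the function

$$p \mapsto \frac{1}{p}\,g_p(t) = -\frac{F_t(1/p)}{1 - 1/p}$$

is decreasing on $(1, \infty)$. Hence we have for $1 < p < p' < \infty$

$$\frac{1}{p'}\,S_{g_{p'}}(\varphi, \psi) \le \frac{1}{p}\,S_{g_p}(\varphi, \psi)$$

and the first inequality follows. The second inequality is obtained from
the first and from $S_\alpha(\varphi, \psi) = S_{-\alpha}(\psi, \varphi)$. □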
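The two inequalities of Lemma 10.1 are easy to probe numerically. The sketch below is restricted to the commutative case (an assumption for illustration, with positive functionals represented by vectors with positive entries and the hypothetical helper `alpha_divergence`), and it uses $p = 2/(1-\alpha)$, $q = 2/(1+\alpha)$.

```python
import random

def alpha_divergence(phi, psi, alpha):
    """Commutative S_alpha: q*sum(phi) + p*sum(psi) - p*q*sum(phi^(1/p) * psi^(1/q))."""
    p = 2.0 / (1.0 - alpha)
    q = 2.0 / (1.0 + alpha)
    cross = sum(a ** (1.0 / p) * b ** (1.0 / q) for a, b in zip(phi, psi))
    return q * sum(phi) + p * sum(psi) - p * q * cross

random.seed(1)
tol = 1e-9          # slack for floating point rounding
violations = 0      # number of sampled cases contradicting Lemma 10.1
for _ in range(2000):
    phi = [random.uniform(0.01, 3.0) for _ in range(5)]
    psi = [random.uniform(0.01, 3.0) for _ in range(5)]
    a, b = sorted(random.uniform(-0.95, 0.95) for _ in range(2))   # a <= b
    s_a = alpha_divergence(phi, psi, a)
    s_b = alpha_divergence(phi, psi, b)
    if (1.0 - b) * s_b > (1.0 - a) * s_a + tol:
        violations += 1
    if (1.0 + a) * s_a > (1.0 + b) * s_b + tol:
        violations += 1
```

No violations are expected; a random search like this is of course only a sanity check and not a proof.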

From the last Lemma, we get for $\varphi \in M_*^+$, $-1 < \alpha < \beta < 1$ and
$d > 0$,

$$O_\alpha\Big(\varphi, \frac{1 - \beta}{1 - \alpha}\,d\Big) \subset O_\beta(\varphi, d) \subset O_\alpha\Big(\varphi, \frac{1 + \beta}{1 + \alpha}\,d\Big),$$

hence the $S_\alpha$-topologies are the same for all $\alpha \in (-1, 1)$. In particular,
these are the same as the $S_0$-topology, which, by Section 9, is the same
as the 0-topology. It follows that on the positive cone, the topology
induced from $S_\alpha$ coincides with the $L_1$-topology.

11. The unit sphere. 

The $\alpha$-embedding maps the unit sphere $S$ in $M_*$ onto the sphere $S_p$
with radius $p$ in $L_p(M, \phi)$. The duality map $x \mapsto \tilde{x}$ maps $S_p$ onto the
sphere $S'_q$ with radius $q$ in the dual space $L_q(M, \phi)$. From (7), we have
that for $x \in S_p$,

(13) $$\tilde{x} = q\,v_{x/p}.$$

Proposition 11.1. The duality map $S_p \ni x \mapsto \tilde{x} \in S'_q$ is uniformly
continuous.

Proof. The statement follows from (13) and Theorem 2.1. □

Further, there is a unique tangent hyperplane $x + H_x$ through $x$,
where $H_x$ is given by the condition

$$\mathrm{Re}\langle y, \tilde{x}\rangle_\phi = q\,\mathrm{Re}\langle y, v_{x/p}\rangle_\phi = 0.$$

Hence there is a splitting $L_p(M, \phi) = H_x \oplus [x]$ and, similarly as in [7],
there is a continuous projection $\pi_x : L_p(M, \phi) \to H_x$, given by

$$\pi_x(y) = y - \mathrm{Re}\langle y, v_{x/\|x\|_p}\rangle_\phi\,\frac{x}{p} = y - \frac{1}{pq}\,\mathrm{Re}\langle y, \tilde{x}\rangle_\phi\,x,$$

which is obtained by minimizing the $L_p$-norm.

As the norm is strongly differentiable, the unit sphere can be given
the structure of a differentiable submanifold $\mathcal{D}_\alpha$ in $\mathcal{M}_\alpha$. If $\psi \in \mathcal{D}_\alpha$
has the $\alpha$-coordinate $x \in S_p$, then the tangent space $T_x(\mathcal{D}_\alpha)$ can be
identified with the tangent hyperplane $H_x$ and $\pi_x$ can be used to project
the $\alpha$-connection onto $T\mathcal{D}_\alpha$. But, even in the classical and the matrix
case, the projected connection is no longer flat. Hence, it does not
define a divergence, but nevertheless, we can use the restriction of $S_\alpha$
as a quasi-distance on $S$. This restriction has the form

$$S_\alpha(\omega_1, \omega_2) = pq\big(1 - \mathrm{Re}\langle u\Delta_{\varphi,\phi}^{1/p}, v\Delta_{\psi,\phi}^{1/q}\rangle_\phi\big),$$

which corresponds to the definition of the $\alpha$-divergence in PQ for
probability densities and in ^2] for density matrices.

Let us now consider the topologies induced on the set of states
$S^+ \subset M_*^+$. From jT3] pp. 354, we have that the weak and the strong
topologies coincide on the unit sphere of a uniformly convex space,
hence these coincide on $S_p$. It follows that the relative $\alpha$-topology and
the $\alpha$-weak topology are the same on $S$.

Let now $\varphi, \psi \in S$ and let $\ell_\alpha(\varphi) = x$, $\ell_\alpha(\psi) = y$. Then $x, y \in S_p \subset
L_p(M, \phi)$ and

$$\Big\|\frac{1}{2}\Big(\frac{x}{p} + \frac{y}{p}\Big)\Big\|_p \ge \frac{1}{2pq}\,|\mathrm{Re}\langle x + y, \tilde{y}\rangle_\phi| = \Big|1 - \frac{1}{2pq}D_p(x, y)\Big|.$$

Therefore if $D_p(x, y) < 2pq\,\delta(\varepsilon)$, where $\delta(\varepsilon)$ is the modulus of
convexity, then $\|\frac{1}{2}(\frac{x}{p} + \frac{y}{p})\|_p > 1 - \delta(\varepsilon)$ and uniform convexity implies
that $\|x - y\|_p < p\varepsilon$. It follows that for each $\varepsilon > 0$, the set $S_p \cap
\ell_\alpha(O_\alpha(\psi, 2pq\,\delta(\varepsilon/p)))$ is contained in the strong neighborhood
$S_p \cap \{x : \|x - y\|_p < \varepsilon\}$. Therefore, the $S_\alpha$-topology coincides with the $\alpha$-topology on
$S$. We have proved the following

Proposition 11.2. The topologies on $S^+$, inherited from the $\alpha$-topology,
$\alpha$-weak topology and $S_\alpha$-topology coincide with the $L_1$-topology for all
$\alpha \in (-1, 1)$.

Corollary 11.1. The restriction of $S_\alpha$ to $S^+ \times S^+$ is continuous in the
$L_1$-topology.
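The norm estimate used before Proposition 11.2, namely $\|\frac{1}{2}(\frac{x}{p} + \frac{y}{p})\|_p \ge |1 - \frac{1}{2pq}D_p(x, y)|$ for $x, y \in S_p$, can be checked on classical states. The sketch assumes the commutative case (probability vectors, counting measure), where $x/p = \varphi^{1/p}$ entrywise and $D_p(x, y) = pq(1 - \sum_i \varphi_i^{1/p}\psi_i^{1/q})$; this model and the variable names are illustrative assumptions.

```python
import random

random.seed(2)
p = 3.0
q = p / (p - 1.0)
min_gap = float("inf")   # smallest value of (left side - right side) over all samples
for _ in range(1000):
    raw_phi = [random.uniform(0.01, 1.0) for _ in range(4)]
    raw_psi = [random.uniform(0.01, 1.0) for _ in range(4)]
    total_phi, total_psi = sum(raw_phi), sum(raw_psi)
    phi = [v / total_phi for v in raw_phi]    # probability vectors play the role of states
    psi = [v / total_psi for v in raw_psi]
    # || (x/p + y/p) / 2 ||_p with x/p = phi^(1/p), y/p = psi^(1/p)
    mid_norm = sum(((a ** (1.0 / p) + b ** (1.0 / p)) / 2.0) ** p
                   for a, b in zip(phi, psi)) ** (1.0 / p)
    # D_p(x, y) = p q (1 - sum phi^(1/p) psi^(1/q)) for normalized phi, psi
    d = p * q * (1.0 - sum(a ** (1.0 / p) * b ** (1.0 / q) for a, b in zip(phi, psi)))
    min_gap = min(min_gap, mid_norm - abs(1.0 - d / (2.0 * p * q)))
```

In this model the estimate is an instance of Hölder's inequality, so `min_gap` should be nonnegative up to rounding.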